6 research outputs found

Predicting Secondary Structures, Contact Numbers, and Residue-wise Contact Orders of Native Protein Structure from Amino Acid Sequence by Critical Random Networks

Author: Altschul S. F., Madden, T. L., Sch
Baldi P., Brunak, S., Frasconi, P.
CHANDONIA J-M
Crooks G. E.
Kinjo A. R.
Kinjo A. R.
Kinjo A. R., Horimoto, K.
Lee B.
Li W., Jaroszewski, L.
Nishikawa K.
Pollastri G., Baldi, P., Fariselli
Rost B.
TATENO Y
Publication venue: 'Biophysical Society of Japan'
Publication date: 01/01/2005

Prediction of one-dimensional protein structures such as secondary structures and contact numbers is useful for three-dimensional structure prediction and important for understanding the sequence-structure relationship. Here we present a new machine-learning method, critical random networks (CRNs), for predicting one-dimensional structures, and apply it, with position-specific scoring matrices, to the prediction of secondary structures (SS), contact numbers (CN), and residue-wise contact orders (RWCO). The present method achieves, on average, a $Q_3$ accuracy of 77.8% for SS, and correlation coefficients of 0.726 and 0.601 for CN and RWCO, respectively. The accuracy of the SS prediction is comparable to other state-of-the-art methods, and that of the CN prediction is a significant improvement over previous methods. We give a detailed formulation of the CRN-based prediction scheme, and examine the context-dependence of prediction accuracies. In order to study the nonlinear and multi-body effects, we compare the CRN-based method with a purely linear method based on position-specific scoring matrices. Although not superior to the CRN-based method, the surprisingly good accuracy achieved by the linear method highlights the difficulty in extracting structural features of higher order from amino acid sequence beyond that provided by the position-specific scoring matrices.

Comment: 20 pages, 1 figure, 5 tables; minor revision; accepted for publication in BIOPHYSIC
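The two evaluation measures quoted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the usual reading of Q_3 as the fraction of residues whose three-state label (H, E, C) is predicted correctly, and a plain Pearson correlation for the real-valued CN and RWCO predictions. The function names `q3_accuracy` and `pearson` are hypothetical.

```python
import math

def q3_accuracy(predicted, observed):
    """Fraction of residues whose 3-state label (H, E, C) is predicted correctly."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data, invented for illustration only.
print(q3_accuracy("HHHECC", "HHEECC"))
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Whether the published averages are taken per chain or per residue is not stated in the abstract, so the sketch leaves the averaging step out.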

arXiv.org e-Print Archive

Crossref

Improved residue contact prediction using support vector machines and a large feature set

Author: A Aszodi
A Lesk
A Murzin
A Ortiz
A Ortiz
A Valencia
A Vullo
B Rost
B Rost
B Schölkopf
D Bau
D Fischer
D Fischer
E Huang
G Pollastri
G Pollastri
G Pollastri
H Drucker
H Zhu
I Halperin
I Shindyalov
J Cheng
J Cheng
J Cheng
J Cheng
J Moult
J Moult
J Moult
J Skolnick
J Skolnick
J Vert
Jianlin Cheng
K Karplus
K Plaxco
M Punta
M Punta
M Vendruscolo
N Hamilton
O Grana
O Grana
O Lund
O Olmea
O Olmea
P Baldi
P Fariselli
P Fariselli
P Fariselli
P Kraulis
Pierre Baldi
PJ Kundrotas
R Bonneau
R MacCallum
S Miyazawa
T Joachims
T Joachims
U Goebel
V Vapnik
V Vapnik
Y Shao
Y Zhang
Y Zhang
Y Zhao
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and understanding protein folding. In spite of steady progress over the past decade, contact prediction remains largely unsolved. RESULTS: Here we develop a new contact map predictor (SVMcon) that uses support vector machines to predict medium- and long-range contacts. SVMcon integrates profiles, secondary structure, relative solvent accessibility, contact potentials, and other useful features. On the same test data set, SVMcon's accuracy is 4% higher than the latest version of the CMAPpro contact map predictor. SVMcon recently participated in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7) experiment and was evaluated along with seven other contact map predictors. SVMcon was ranked as one of the top predictors, yielding the second best coverage and accuracy for contacts with sequence separation >= 12 on 13 de novo domains. CONCLUSION: We describe SVMcon, a new contact map predictor that uses SVMs and a large set of informative features. SVMcon yields good performance on medium- to long-range contact predictions and can be modularly incorporated into a structure prediction pipeline.
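As a rough illustration of how contacts with sequence separation >= 12 might be scored, the sketch below derives native contacts from one coordinate per residue and compares them with a predicted set. Several details are assumptions not given in the abstract: the 8 Å distance cutoff, the use of a single representative atom per residue, and the definitions accuracy = correct predictions / predicted contacts and coverage = correct predictions / native contacts. The function names are hypothetical.

```python
import math

def native_contacts(coords, cutoff=8.0, min_sep=12):
    """Residue pairs (i, j) with j - i >= min_sep whose distance is below cutoff.

    coords holds one (x, y, z) tuple per residue; the 8.0 cutoff is an assumption.
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts

def accuracy_and_coverage(predicted, native):
    """Accuracy: share of predicted pairs that are native. Coverage: share of native pairs found."""
    hits = len(predicted & native)
    accuracy = hits / len(predicted) if predicted else 0.0
    coverage = hits / len(native) if native else 0.0
    return accuracy, coverage

# Toy chain: residues spaced 10 units apart, except residue 12 folded back next to residue 0.
coords = [(10.0 * i, 0.0, 0.0) for i in range(14)]
coords[12] = (1.0, 0.0, 0.0)
native = native_contacts(coords)
print(native)
print(accuracy_and_coverage({(0, 12), (1, 13)}, native))
```

The pairwise loop is quadratic in chain length, which is adequate for a sketch but would usually be replaced by a vectorised distance matrix for real data.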

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks

Author: A Aszodi
A Aszodi
A Godzika
A Kryshtafovych
A Martin
A Schlessinger
A Vullo
A Zemla
B Rost
B Xue
C Mooney
C Venter
Claudio Mirabello
D Bau
D Jones
D Marks
D Pelta
E Krieger
E Lander
F Pazos
G Pollastri
G Pollastri
G Pollastri
G Pollastri
G Pollastri
G Shackelford
Gianluca Pollastri
Giuseppe Tradigo
H Zhou
I Ezkurdia
I Walsh
Ian Walsh
J Cheng
J Cheng
J Gorodkin
J Izarzugaza
K Han
K Kohlhoff
K Simons
M Pietal
M Punta
M Punta
M Reese
M Vassura
M Vendruscolo
N Qian
O Lund
P Baldi
P Baldi
P Fariselli
P Robustelli
Pierangelo Veltri
Predrag Kukic
S Altschul
S Griep
S Yooseph
T Hopf
U Göbel
U Hobohm
W Boomsma
W Kabsch
Y Shao
Y Shen
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Improved prediction of the number of residue contacts in proteins by recurrent neural networks

Author: G. Pollastri
P. Baldi
P. Fariselli
R. Casadio
Publication venue: 'Oxford University Press (OUP)'

Crossref

Improved prediction of the number of residue contacts in proteins by recurrent neural networks

Author: Baldi P
Casadio R.
Fariselli Piero
Pollastri G
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2001

Knowing the number of residue contacts in a protein is crucial for deriving constraints useful in modeling protein folding, protein structure, and/or scoring remote homology searches. Here we use an ensemble of bidirectional recurrent neural network architectures and evolutionary information to improve the state-of-the-art in contact prediction using a large corpus of curated data. The ensemble is used to discriminate between two different states of residue contacts, characterized by a contact number higher or lower than the average value of the residue distribution. The ensemble achieves performances ranging from 70.1% to 73.1% depending on the radius adopted to discriminate contacts (6 Å to 12 Å). These performances represent gains of 15% to 20% over the baseline statistical predictors always assigning an amino acid to the most numerous state, 3% to 7% better than any previous method. Combination of different radius predictors further improves the performance. © Oxford University Press 2001
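The two-state setup and the baseline described in this abstract can be sketched as below. This is only an illustration under stated assumptions: one coordinate per residue, a contact counted when two residues lie within the chosen radius, and the "average value" taken over the residues of the toy chain itself (the paper's averaging scheme is not specified in the abstract). All function names are hypothetical.

```python
import math
from collections import Counter

def contact_numbers(coords, radius):
    """For each residue, count the other residues within `radius` of it."""
    return [
        sum(1 for j, q in enumerate(coords) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(coords)
    ]

def two_state_labels(numbers):
    """Label each residue 'high' or 'low' relative to the mean contact number (assumed threshold)."""
    mean = sum(numbers) / len(numbers)
    return ["high" if n > mean else "low" for n in numbers]

def majority_baseline_accuracy(labels):
    """Accuracy of a predictor that always outputs the most numerous state."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

# Toy chain of five residues spaced 4 units apart on a line.
coords = [(4.0 * i, 0.0, 0.0) for i in range(5)]
numbers = contact_numbers(coords, radius=6.0)
labels = two_state_labels(numbers)
print(numbers, labels, majority_baseline_accuracy(labels))
```

Any trained predictor would be compared against the value returned by `majority_baseline_accuracy` on the same labels, which is the kind of baseline the abstract reports gains over.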

CiteSeerX

Archivio istituzionale della ricerca - Università di Padova

The MULTICOM toolbox for protein structure prediction

Author: A Fiser
A Fuchs
A Kryshtafovych
A Porollo
A Roy
A Vullo
A Zemla
A Šali
AK Dunker
AN Tegge
B Monastyrskyy
B Monastyrskyy
B Petersen
B Rost
B Rost
B Wallner
BD O’Connor
BG Fox
C Cole
CL Worth
D Baker
D Baú
D Cozzetto
D Cozzetto
D Eisenberg
D Frishman
D Frishman
D Gilis
D Xu
DB Roche
E Capriotti
E Faraggi
F Ferre
FC Bernstein
G Karypis
G Lin
G Pollastri
G Pollastri
G Shackelford
H Berman
H Zhou
H Zhou
I Ezkurdia
J Cheng
J Cheng
J Cheng
J Cheng
J Cheng
J Cheng
J Dai
J Eickholt
J Kendrew
J Liu
J Moult
J Moult
J Moult
J Peng
J Sim
J Soding
J Xu
JD Thompson
JE Gewehr
Jesse Eickholt
Jianlin Cheng
Jilong Li
JJ Ward
JL MacCallum
JMG Izarzugaza
K Karplus
K Karplus
K Shimizu
K Shimizu
K Simons
L Kinch
L McGuffin
L McGuffin
L McGuffin
LM Iakoucheva
M Paluszewski
M Perutz
M Punta
M Wagner
MJ Mizianty
O Zimmermann
P Baldi
P Baldi
P Björkholm
P Bradley
P Chen
P Fariselli
P Larsson
R Adamczak
R Adamczak
RL Marsden
S Hirose
S Singh
S Wu
S Wu
T Ishida
T Zhang
TZ Sen
V Mariani
V Parthiban
W Kabsch
X Deng
X Deng
X Deng
Xin Deng
Y Li
Y Yang
Y Zhang
Y Zhang
Y Zhang
Y Zhang
Y Zhang
Z Dosztányi
Z Dosztányi
Z Wang
Z Wang
Z Wang
Z Wang
Zheng Wang
Publication venue: BMC
Publication date: 01/04/2012

BACKGROUND: As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e. ~75,000 or < 0.07%) have solved tertiary structures determined by experimental techniques. The gap between protein sequence and structure continues to enlarge rapidly as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial to make use of this vast repository of protein resources. RESULTS: To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. CONCLUSIONS: These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near state-of-the-art performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.

Crossref

Directory of Open Access Journals